Contraction operator

The Realizability of Revision and Contraction Operators in Epistemic Spaces

Sauerwald, Kai, Thimm, Matthias

arXiv.org Artificial Intelligence

This paper studies the realizability of belief revision and belief contraction operators in epistemic spaces. We observe that AGM revision and AGM contraction operators for epistemic spaces are realizable only in precisely determined epistemic spaces. We define the class of linear change operators, a special kind of maxichoice operator, and show that whenever AGM revision or AGM contraction is realizable, linear change operators provide a canonical realization.
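
For orientation, the AGM contraction postulates referenced here are standard textbook background (not notation introduced by the paper); for a belief set K closed under a consequence operator Cn and an input α:

    (K \div 1)\;\; K \div \alpha = \mathrm{Cn}(K \div \alpha)                                             % closure
    (K \div 2)\;\; K \div \alpha \subseteq K                                                              % inclusion
    (K \div 3)\;\; \alpha \notin \mathrm{Cn}(K) \Rightarrow K \div \alpha = K                             % vacuity
    (K \div 4)\;\; \alpha \notin \mathrm{Cn}(\emptyset) \Rightarrow \alpha \notin \mathrm{Cn}(K \div \alpha)   % success
    (K \div 5)\;\; K \subseteq \mathrm{Cn}((K \div \alpha) \cup \{\alpha\})                               % recovery
    (K \div 6)\;\; \mathrm{Cn}(\alpha) = \mathrm{Cn}(\beta) \Rightarrow K \div \alpha = K \div \beta      % extensionality

A maxichoice contraction keeps exactly one maximal subset of K that fails to imply α; the paper's linear change operators are a particular such selection.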


Unifying Non-Maximum Likelihood Learning Objectives with Minimum KL Contraction

Neural Information Processing Systems

When used to learn high-dimensional parametric probabilistic models, classical maximum likelihood (ML) learning often suffers from computational intractability, which has motivated the active development of non-ML learning methods. Yet, because of their divergent motivations and forms, the objective functions of many non-ML learning methods are seemingly unrelated, and a unified framework for understanding them has been lacking. In this work, based on an information-geometric view of parametric learning, we introduce a general non-ML learning principle termed minimum KL contraction, in which we seek optimal parameters that minimize the contraction of the KL divergence between two distributions after both are transformed with a KL contraction operator. We then show that the objective functions of several important or recently developed non-ML learning methods, including contrastive divergence [12], noise-contrastive estimation [11], partial likelihood [7], non-local contrastive objectives [31], score matching [14], pseudo-likelihood [3], maximum conditional likelihood [17], maximum mutual information [2], maximum marginal likelihood [9], and conditional and marginal composite likelihood [24], can be unified under the minimum KL contraction framework with different choices of the KL contraction operators.
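
As a sketch of the framework in our own notation (an illustration consistent with the abstract, not necessarily the paper's exact formulation): an operator Φ on distributions is a KL contraction if it never increases KL divergence, and learning minimizes the amount by which Φ contracts the divergence between the data distribution P and the model P_θ:

    \mathrm{KL}(\Phi P \,\|\, \Phi P_\theta) \le \mathrm{KL}(P \,\|\, P_\theta), \qquad
    \hat{\theta} = \arg\min_{\theta} \big[\, \mathrm{KL}(P \,\|\, P_\theta) - \mathrm{KL}(\Phi P \,\|\, \Phi P_\theta) \,\big]

The objective is non-negative by the contraction property, and for a strict contraction it vanishes only when the model matches the data distribution; different choices of Φ then recover the listed non-ML objectives.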


Belief Change based on Knowledge Measures

Straccia, Umberto, Casini, Giovanni

arXiv.org Artificial Intelligence

Knowledge Measures (KMs) aim at quantifying the amount of knowledge/information that a knowledge base carries. Belief Change (BC), on the other hand, is the process of changing beliefs (in our case, in terms of contraction, expansion and revision) to take into account a new piece of knowledge, which may contradict the current belief. We propose a new quantitative BC framework based on KMs, defining belief change operators that try to minimise, from an information-theoretic point of view, the surprise that the changed belief carries. To this end, we introduce the principle of minimal surprise. In particular, our contributions are (i) a general information-theoretic approach to KMs, of which [1] is a special case; (ii) KM-based BC operators that satisfy the so-called AGM postulates; and (iii) a characterisation of the BC operators satisfying the AGM postulates as KM-based BC operators, i.e., any BC operator satisfying the AGM postulates can be encoded within our quantitative BC framework. We also introduce quantitative measures that account for the information loss of contraction, the information gain of expansion and the information change of revision. Finally, we take a succinct look at the problem of iterated revision, which deals with the application of a sequence of revision operations in our framework, and illustrate how one may build from our KM-based contraction operator one that does not satisfy the (in)famous recovery postulate, using the so-called severe withdrawal model as an illustrative example.
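
As a purely illustrative instance of a knowledge measure in the finite propositional setting (our assumption for concreteness, not the paper's definition): with Ω the set of all interpretations and Mod(K) the models of K, one may take

    I(K) = \log_2 |\Omega| - \log_2 |\mathrm{Mod}(K)|

so that a tautologous base carries no information and knowledge grows as models are excluded. The principle of minimal surprise would then prefer, among the admissible outcomes of a change, one whose measure deviates least from that of the original belief.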


Probabilistic Contraction Analysis of Iterated Random Operators

Gupta, Abhishek, Jain, Rahul, Glynn, Peter

arXiv.org Artificial Intelligence

In many branches of engineering, the Banach contraction mapping theorem is employed to establish the convergence of certain deterministic algorithms. Randomized versions of these algorithms have been developed and have proved useful in data-driven problems. In a class of randomized algorithms, in each iteration the contraction map is approximated with an operator that uses independent and identically distributed samples of certain random variables. This leads to iterated random operators acting on an initial point in a complete metric space, which generate a Markov chain. In this paper, we develop a new stochastic-dominance-based proof technique, called probabilistic contraction analysis, for establishing the convergence in probability of Markov chains generated by such iterated random operators in a certain limiting regime. The methods developed in this paper provide a general framework for understanding the convergence of a wide variety of Monte Carlo methods in which a contraction property is present. We apply the convergence result to conclude the convergence of fitted value iteration and fitted relative value iteration in continuous-state, continuous-action Markov decision problems as representative applications of the general framework developed here.
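
A toy experiment makes the setting concrete (our own illustration, not code from the paper): the map T(x) = 0.5 x + E[W] is a Banach contraction with a unique fixed point, but each iteration only sees a sample-mean approximation of E[W], so iterating the random operator generates a Markov chain that concentrates on the fixed point as the sample size grows.

    import numpy as np

    rng = np.random.default_rng(0)

    def noisy_contraction(x, n_samples):
        # The exact map T(x) = 0.5 * x + E[W], with E[W] = 1, is a Banach
        # contraction with fixed point x* = 2.  We only observe i.i.d.
        # samples of W, so each step applies a random approximation of T.
        w = rng.exponential(scale=1.0, size=n_samples)  # E[W] = 1
        return 0.5 * x + w.mean()

    for n in (10, 1_000, 100_000):
        x = 0.0
        for _ in range(200):       # iterate the random operator
            x = noisy_contraction(x, n)
        print(n, x)                # concentrates near x* = 2 as n grows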


SENDER: SEmi-Nonlinear Deep Efficient Reconstructor for Extracting Canonical, Meta, and Sub-Functional Connectivity in the Human Brain

Zhang, Wei, Bao, Yu

arXiv.org Artificial Intelligence

Deep linear and nonlinear learning methods have become vital machine learning tools for investigating hierarchical features, such as functional connectivity, in the human brain via functional Magnetic Resonance signals; however, they have four major shortcomings: 1). For deep linear learning methods, although the identified hierarchy of functional connectivity is easily explainable, it is challenging to reveal more hierarchical functional connectivity; 2). For deep nonlinear learning methods, although a non-fully connected architecture reduces the complexity of neural network structures, making them easier to optimize and less vulnerable to overfitting, the functional connectivity hierarchy is difficult to explain; 3). Importantly, it is challenging for deep linear/nonlinear methods to detect meta- and sub-functional connectivity even in the shallow layers; 4). Like most conventional deep nonlinear methods, such as deep neural networks, the hyperparameters must be tuned manually, which is time-consuming. Thus, in this work, we propose a novel deep hybrid learning method named SEmi-Nonlinear Deep Efficient Reconstruction (SENDER) to overcome the aforementioned shortcomings: 1). SENDER utilizes a multiple-layer stacked structure for the linear learning methods to detect the canonical functional connectivity; 2). SENDER implements a non-fully connected architecture for the nonlinear learning methods to reveal the meta-functional connectivity through shallow and deeper layers; 3). SENDER incorporates the proposed background components to extract the sub-functional connectivity; 4). SENDER adopts a novel rank reduction operator to tune the hyperparameters automatically. To further validate its effectiveness, we compared SENDER with four peer methodologies using real functional Magnetic Resonance Imaging data from the human brain.
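
The abstract does not spell out the rank reduction operator, but energy-based singular value thresholding is one standard way to pick the number of components automatically; the sketch below is our assumption of what such an operator could look like, not the authors' implementation.

    import numpy as np

    def rank_reduce(X, energy=0.95):
        # Hypothetical rank-reduction operator: keep the smallest number
        # of singular components explaining `energy` of the variance,
        # removing the manual choice of the rank hyperparameter.
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        k = int(np.searchsorted(np.cumsum(s**2) / np.sum(s**2), energy)) + 1
        return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :], k

    X = np.random.default_rng(1).normal(size=(200, 50))
    X_low, k = rank_reduce(X)
    print(k, np.linalg.norm(X - X_low) / np.linalg.norm(X))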


Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification, and Local Computations

Basu, Debraj, Data, Deepesh, Karakus, Can, Diggavi, Suhas

arXiv.org Machine Learning

The communication bottleneck has been identified as a significant issue in distributed optimization of large-scale learning models. Recently, several approaches to mitigate this problem have been proposed, including different forms of gradient compression and computing local models and mixing them iteratively. In this paper we propose the Qsparse-local-SGD algorithm, which combines aggressive sparsification with quantization and local computation, along with error compensation, by keeping track of the difference between the true and compressed gradients. We propose both synchronous and asynchronous implementations of Qsparse-local-SGD. We analyze convergence of Qsparse-local-SGD in the distributed setting for smooth non-convex and convex objective functions. We demonstrate that Qsparse-local-SGD converges at the same rate as vanilla distributed SGD for many important classes of sparsifiers and quantizers. We use Qsparse-local-SGD to train ResNet-50 on ImageNet and show that it results in significant savings over the state of the art in the number of bits transmitted to reach a target accuracy.
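
A minimal sketch of the kind of compression step such an algorithm composes (our simplification, with hypothetical helper names; the paper's operators and analysis are more general):

    import numpy as np

    def topk_quantize(g, k, levels=16):
        # Top-k sparsification followed by uniform quantization of the
        # surviving magnitudes.
        out = np.zeros_like(g)
        idx = np.argpartition(np.abs(g), -k)[-k:]   # k largest entries
        vals = g[idx]
        scale = max(np.abs(vals).max() / (levels - 1), 1e-12)
        out[idx] = np.round(vals / scale) * scale
        return out

    def compress_with_error_feedback(g, memory, k):
        # Error compensation: add back what earlier rounds failed to send,
        # then carry the new residual forward.
        corrected = g + memory
        compressed = topk_quantize(corrected, k)
        memory[:] = corrected - compressed
        return compressed

Each worker would apply compress_with_error_feedback to its locally accumulated gradient before communicating, keeping memory as persistent per-worker state.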


Decrement Operators in Belief Change

Sauerwald, Kai, Beierle, Christoph

arXiv.org Artificial Intelligence

While research on iterated revision is predominant in the field of iterated belief change, the class of iterated contraction operators has received more attention in recent years. In this article, we examine a non-prioritized generalisation of iterated contraction. In particular, we introduce the class of weak decrement operators, which are operators that achieve over multiple steps the same effect as a contraction. Inspired by Darwiche and Pearl's work on iterated revision, the subclass of decrement operators is defined. For both decrement and weak decrement operators, postulates are presented, and for each of them a representation theorem in the framework of total preorders is given. Furthermore, we present two types of decrement operators which have a unique representative.
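
For orientation, representation theorems of this kind live in the standard total-preorder semantics of belief change, where each belief state K is assigned a total preorder ⪯_K over interpretations and contraction is characterised by (standard background, not the decrement postulates themselves):

    \mathrm{Mod}(K \div \alpha) = \mathrm{Mod}(K) \cup \min(\mathrm{Mod}(\neg\alpha), \preceq_K)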


Using Defeasible Information to Obtain Coherence

Casini, Giovanni (University of Luxembourg) | Meyer, Thomas (University of Cape Town)

AAAI Conferences

We consider the problem of obtaining coherence in a propositional knowledge base using techniques from Belief Change. Our motivation comes from the field of formal ontologies, where coherence is interpreted to mean that a concept name has to be satisfiable. In the propositional case we consider here, this translates to a propositional formula being satisfiable. We define belief change operators in a framework of nonmonotonic preferential reasoning. We show how the introduction of defeasible information using contraction operators can be an effective means for obtaining coherence.
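
Concretely (a one-line illustration of ours, not the paper's construction): if a concept name is encoded by an atom p and the base K entails ¬p, then p is incoherent with respect to K; contracting by ¬p restores coherence, since the success postulate yields

    \neg p \notin \mathrm{Cn}(K \div \neg p)

so p becomes satisfiable together with the contracted base.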


Kernel Contraction and Base Dependence: Redundancy in the Base Resulting in Different Types of Dependence

Oveisi, Mehrdad (Simon Fraser University) | Delgrande, James P. (Simon Fraser University) | Popowich, Fred (Simon Fraser University) | Pelletier, Francis Jeffry (University of Alberta)

AAAI Conferences

The AGM paradigm of belief change studies the dynamics of belief states in light of new information. Finding, or even approximating, the beliefs dependent on or relevant to a change is valuable because, for example, it can narrow the set of beliefs considered during belief change operations. Gärdenfors' preservation criterion (GPC) suggests that formulas independent of a belief change should remain intact. GPC makes it possible to build dependence relations that are theoretically linked with belief change. Such dependence relations can in turn be used as a theoretical benchmark against which to evaluate other approximate dependence or relevance relations. There are already some studies, based on GPC, on the parallelism between belief change and dependence. One study offers a dependence relation parallel to AGM contraction for belief sets. Another links a base dependence relation to a more general belief base contraction, saturated kernel contraction. Here we offer a still more general parallelism between kernel contraction and base dependence. At this level of generalization, different types of base dependence emerge, and we prove that this differentiation is a result of possible redundancy in the base. This provides a theoretical means to distinguish between the redundant and informative parts of a belief base.
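
For readers unfamiliar with the construction: in kernel contraction (Hansson), the kernel set A ⊥⊥ α collects the minimal subsets of a base A that imply α, and an incision function σ selects at least one element from each non-empty kernel; contraction removes exactly what σ cuts:

    A \div_\sigma \alpha = A \setminus \sigma(A \perp\!\!\!\perp \alpha)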